Search for: All records

Creators/Authors contains: "Phillips, Joshua L"


  1. Shifts in agricultural land use over the past 200 years have led to a loss of nearly 50% of existing wetlands in the USA, and agricultural activities contribute up to 65% of the nutrients that reach the Mississippi River Basin, directly contributing to biological disasters such as the hypoxic Gulf of Mexico “Dead” Zone. Federal efforts to construct and restore wetland habitats have been employed to mitigate the detrimental effects of eutrophication, with an emphasis on the restoration of ecosystem services such as nutrient cycling and retention. Soil microbial assemblages drive biogeochemical cycles and offer a unique and sensitive framework for the accurate evaluation, restoration, and management of ecosystem services. The purpose of this study was to elucidate patterns of soil bacteria within and among wetlands by developing diversity profiles from high-throughput sequencing data, link functional gene copy number of nitrogen cycling genes to measured nutrient flux rates collected from flow-through incubation cores, and predict nutrient flux using microbial assemblage composition. Soil microbial assemblages showed fine-scale turnover in soil cores collected across the topsoil horizon (0–5 cm; top vs bottom partitions) and were structured by restoration practices on the easements (tree planting, shallow water, remnant forest). Connections between soil assemblage composition, functional gene copy number, and nutrient flux rates show the potential for soil bacterial assemblages to be used as bioindicators for nutrient cycling on the landscape. In addition, the predictive accuracy of flux rates was improved when implementing deep learning models that paired connected samples across time. 
    Free, publicly-accessible full text available December 1, 2026
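The diversity profiles described above are commonly built from Hill numbers computed over taxon counts from high-throughput sequencing data. A minimal sketch of such a profile follows; the function names and example counts are illustrative, not taken from the paper:

```python
import numpy as np

def hill_number(counts, q):
    """Hill number (effective number of taxa) of order q from raw counts.
    q=0 gives richness, q=1 gives exp(Shannon entropy), q=2 gives
    the inverse Simpson index."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()                      # relative abundances
    if np.isclose(q, 1.0):                      # q=1 is the limit case
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))

def diversity_profile(counts, orders=(0, 1, 2)):
    """Evaluate Hill numbers at several orders to form a profile."""
    return [hill_number(counts, q) for q in orders]

# A sample dominated by one taxon keeps its richness (q=0) but its
# effective diversity drops sharply at higher orders.
profile = diversity_profile([90, 5, 3, 2])
```

Plotting such profiles across sampling locations or soil partitions is one common way to compare assemblages at multiple sensitivities to rare taxa.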
  2. High-throughput sequencing (HTS) is a modern DNA sequencing technology used to rapidly read thousands of genomic fragments from the microorganisms in a sample. The large amount of data produced by this process makes deep learning, whose performance often scales with dataset size, a suitable fit for processing HTS samples. While deep learning models have utilized sets of DNA sequences to make informed predictions, to our knowledge, there are no models in the current literature capable of generating synthetic HTS samples, a tool that could enable experimenters to predict HTS samples given some environmental parameters. Furthermore, the unordered nature of HTS samples poses a challenge to nearly all deep learning architectures because they have an inherent dependence on input order. To address this gap in the literature, we introduce the DNA Generative Adversarial Set Transformer (DNAGAST), the first model capable of generating synthetic HTS samples. We qualitatively and quantitatively demonstrate DNAGAST's ability to produce realistic synthetic samples and explore various methods to mitigate mode collapse. Additionally, we propose novel quantitative diversity metrics to measure the effects of mode collapse for unstructured set-based data. 
    Free, publicly-accessible full text available May 14, 2026
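The abstract does not specify the paper's diversity metrics, but one simple proxy for mode collapse in generated sets is the mean pairwise distance among a set's elements: a collapsed generator emits near-identical elements and scores near zero. A sketch under that assumption (all names and data are illustrative):

```python
import numpy as np

def mean_pairwise_distance(set_elements):
    """Average Euclidean distance over all ordered pairs of distinct
    elements in one set; values near zero suggest mode collapse."""
    x = np.asarray(set_elements, dtype=float)
    n = len(x)
    if n < 2:
        return 0.0
    diffs = x[:, None, :] - x[None, :, :]       # (n, n, D) pairwise deltas
    d = np.sqrt((diffs ** 2).sum(axis=-1))      # (n, n) distances
    return float(d.sum() / (n * (n - 1)))       # exclude the zero diagonal

# A "collapsed" generator output clusters tightly; a diverse one spreads out.
collapsed = 0.01 * np.random.default_rng(0).standard_normal((8, 16))
diverse = np.random.default_rng(1).standard_normal((8, 16))
```

Tracking such a statistic over training batches gives a cheap early-warning signal alongside qualitative inspection of generated samples.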
  3. Transformers, although first designed for sequence processing, can also handle unordered sets such as point cloud data. Additionally, contrastive pretraining has emerged as a successful technique in image processing but remains unexplored for point cloud data. We develop and integrate a new point cloud pretraining technique inspired by the Simple Framework for Contrastive Learning (SimCLR) into the Set Transformer (ST) and Point Cloud Transformer (PCT) architectures and explore model performance using a novel 3D body scan dataset and the canonical ShapeNet and ModelNet datasets. For the 3D body scan dataset, this integration boosts initial training performance, maintains overall higher performance for classification tasks, and demonstrates better stability/convergence for regression tasks in comparison to non-pretrained (Naïve) counterparts. Furthermore, experiments examining strong generalization (relative performance on previously unseen classes) show improvement for pretrained models compared to Naïve models. Consistent benefits across tasks and datasets are observed in additional experiments performed on the ShapeNet core dataset. Overall, we show that contrastive pretraining for point cloud data is a viable strategy for improving the performance of Transformers on downstream tasks and accelerating the training process. 
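SimCLR-style pretraining optimizes the NT-Xent contrastive loss over two augmented views of each input. A minimal NumPy sketch of that generic objective, assuming precomputed (N, D) embeddings of the two views (this illustrates the standard SimCLR loss, not the paper's exact implementation):

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent (SimCLR) loss. z1, z2: (N, D) embeddings of the same N
    inputs under two augmentations. Each anchor's positive is its
    counterpart view; all other 2N - 2 embeddings are negatives."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity space
    sim = (z @ z.T) / tau                              # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # positive indices
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    # -log(exp(sim_pos) / sum_j exp(sim_j)), averaged over all 2N anchors
    return float(np.mean(logsumexp - sim[np.arange(2 * n), pos]))
```

Aligned views yield a lower loss than mismatched pairings, which is the gradient signal that pulls augmentations of the same point cloud together in embedding space.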
  4. Current deep-learning techniques for processing sets are limited to a fixed cardinality, causing a steep increase in computational complexity when the set is large. To address this, we have taken techniques used to model long-term dependencies from natural language processing and combined them with the permutation equivariant architecture, Set Transformer (STr). The result is Set Transformer XL (STrXL), a novel deep learning model capable of extending to sets of arbitrary cardinality given fixed computing resources. STrXL's extension capability lies in its recurrent architecture. Rather than processing the entire set at once, STrXL processes only a portion of the set at a time and uses a memory mechanism to provide additional input from the past. STrXL is particularly applicable to processing sets of high-throughput sequencing (HTS) samples of DNA sequences as their set sizes can range into hundreds of thousands. When tasked with classifying HTS prairie soil samples and MNIST digits, results show that STrXL exhibits an expected memory size-accuracy trade-off that scales proportionally with the complexity of downstream tasks, but, unlike STr, is capable of generalizing to sets of arbitrary cardinality. 
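The chunk-plus-memory recurrence described above can be illustrated abstractly. The toy below replaces attention with mean pooling, so it shows only the bounded-memory control flow (per-step cost independent of total set cardinality), not the STrXL architecture itself; all names are ours:

```python
import numpy as np

def process_in_chunks(elements, chunk_size, mem_slots=4):
    """Toy recurrence over a set: consume it chunk-by-chunk under a
    fixed memory budget. Each step sees only the current chunk plus a
    small memory summarizing earlier chunks, so peak cost per step is
    bounded no matter how large the full set is."""
    elements = np.asarray(elements, dtype=float)
    memory = np.zeros((mem_slots, elements.shape[1]))
    for start in range(0, len(elements), chunk_size):
        chunk = elements[start:start + chunk_size]
        combined = np.concatenate([memory, chunk], axis=0)
        summary = combined.mean(axis=0, keepdims=True)  # stand-in for attention
        memory = np.repeat(summary, mem_slots, axis=0)  # carry summary forward
    return memory.mean(axis=0)                          # set-level representation
```

Because the loop never materializes the whole set at once, the same function handles a five-element set and a hundred-thousand-element set with identical per-step memory, which is the property STrXL trades a memory size-accuracy curve against.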
  5. Current deep-learning techniques for processing sets are limited to a fixed cardinality, causing a steep increase in computational complexity when the set is large. To address this, we have taken techniques used to model long-term dependencies from natural language processing and combined them with the permutation equivariant architecture, Set Transformer (STr). The result is Set Transformer XL (STrXL), a novel deep learning model capable of extending to sets of arbitrary cardinality given fixed computing resources. STrXL's extension capability lies in its recurrent architecture. Rather than processing the entire set at once, STrXL processes only a portion of the set at a time and uses a memory mechanism to provide additional input from the past. STrXL is particularly applicable to processing sets of high-throughput sequencing (HTS) samples of DNA sequences as their set sizes can range into hundreds of thousands. When tasked with classifying HTS prairie soil samples and MNIST digits, results show that STrXL exhibits an expected memory size-accuracy trade-off that scales proportionally with the complexity of downstream tasks, but, unlike STr, is capable of generalizing to sets of arbitrary cardinality. 
  6. The human ability to generalize beyond interpolation, often called extrapolation or symbol-binding, is challenging to recreate with computational models. Biologically plausible models incorporating indirection mechanisms have demonstrated strong performance in this regard. Deep learning approaches such as Long Short-Term Memory (LSTM) and Transformers have shown varying degrees of success, but recent work has suggested that Transformers are capable of extrapolation as well. We evaluate the capabilities of the above approaches on a series of increasingly complex sentence-processing tasks to infer the capacity of each individual architecture to extrapolate sentential roles across novel word fillers. We confirm that the Transformer does possess superior abstraction capabilities compared to LSTM. However, what it does not possess is extrapolation capabilities, as evidenced by clear performance disparities on novel filler tasks as compared to working memory-based indirection models. 
  7. In this Work in Progress, we present a progress report from the first two years of a five-year Scholarships in STEM program. The number of graduates with computing-related degrees from colleges and universities, especially female and underrepresented minority (URM) students, is too small to keep up with the fast-growing demand for IT professionals across the nation and in Tennessee specifically. To reduce this gap in the Tennessee region, our university launched a 5-year S-STEM scholarship program in 2018 to recruit and graduate more computer science students, especially female and URM students. The scholarship program supports about 20 qualified Pell-eligible students every year. Each recipient receives an annual stipend of up to $6000 for no more than three years. In order to increase their interest in computer science and to improve retention of CS majors, a pipeline of well-proven activities was integrated into the program to inspire exploration of the CS discipline and computing careers at an early stage and to help students gain work experience before graduation. These activities include, but are not limited to: a summer research program that provides opportunities for students to conduct research in different computer science areas; a peer-mentoring program that leverages the experience and expertise of CS majors working in the computing field to better prepare scholarship recipients for their careers; and a professional conference attendance program that sends students to professional conferences to explore computer science careers and build their own networks. The preliminary data suggest that these activities had a positive effect on our students. We find that the financial support allows students to focus on both academics and the search for computing-related employment. 
Early analysis of institutional data shows that scholars take more CS credit hours and achieve a higher GPA than other Pell-eligible and non-Pell-eligible students, thus making faster progress toward their degrees. The support to attend in-person conferences and summer research opportunities had a transformative impact on many participating scholars. The original mentoring program was less effective and has been redesigned to include higher expectations for mentors and mentees and increased faculty involvement. This paper describes the program elements and explains the effects of these activities on our students, with preliminary outcome data and formative evaluation results about the program. 